ABSTRACT 


The Defense Language Institute Foreign Language Center 
(DLIFLC) trains students in various foreign languages and 
dialects for the Department of Defense (DOD). The majority 
of students are first-term enlistees in the basic program. 
This study uses classification trees and logistic 
regression to understand the military, academic and 
personal characteristics that influence first-term success 
after successfully completing DLIFLC training. Success was 
defined as completing a first-term enlistment contract and 
maintenance of language proficiency. DLIFLC management was 
interested in the difference in success for individuals 
that graduated DLIFLC via the different training pipelines. 
Students graduate by completing the program as originally 
assigned, or by recycling, relanguaging or taking DLPT 
enhancement training multiple times and in multiple 
combinations due to various academic, administrative or 
other reasons. Of the students studied, 63% graduated. Only 
45% of those who graduated were successful post-DLIFLC. Results 
identified several factors influential in predicting 
success; the factors were service affiliation, contract 
lengths and gender. Training pipelines were slightly 
influential. Individuals in the Army had the worst odds of 
success. Contract lengths greater than four years had 
lower odds of success. Males had higher odds of success 
than females. 
EXECUTIVE SUMMARY 


The Defense Language Institute Foreign Language Center 
(DLIFLC) at the United States Army Presidio of Monterey, 
California trains over 3,000 personnel in 23 languages and 
several dialects annually. Personnel include officers and 
enlisted members of all branches of the armed services, as 
well as personnel from several civilian agencies and 
international students. The majority of personnel who 
enter the DLIFLC are first-term enlistees with less than 
two years of service and enroll into the basic course of 
instruction. While trying to successfully complete DLIFLC, 
a student may complete the program as originally assigned, 
recycle into the same language in a later class, relanguage 
into a different language (usually a language of lesser 
difficulty), drop from the program or require Defense 
Language Proficiency Test (DLPT) enhancement training. 
There are a number of reasons (academic, medical, other, 
etc.) classified by DLIFLC that affect a student's ability 
to complete the program as originally assigned. This study 
is interested in the military, academic and personal 
factors that influence post-DLIFLC success of first-term 
enlistees from the basic course of instruction who entered 
DLIFLC during fiscal years 1997 - 2000 and graduated. 

To have been considered a graduate of the DLIFLC, an 
individual had to successfully complete his or her course 
of instruction and meet the requirements of the Defense 
Language Proficiency Test (DLPT). There are numerous 
training pipelines that a student could traverse in order 
to graduate. Training pipelines were based on whether a 
student graduated the program as originally assigned, 
recycled, relanguaged or required DLPT enhancement 
training. Each of these steps could be repeated multiple 
times and in multiple combinations. Each distinct path was 
considered a new training pipeline. There were 56 
pipelines established in the data gathered. These 
pipelines were collapsed into 8 pipelines in which 
meaningful analysis could be accomplished. This study was 
particularly interested in the influence of these pipelines 
on post-DLIFLC success. 

Post-DLIFLC success was defined as an individual 
completing his or her contractual enlistment obligation and 
maintaining his or her language proficiency. Individuals 
were considered to have completed their enlistment contract 
if they did not leave the service more than three months 
before the end of the contract. Maintenance of language 
proficiency was determined by the receipt of Foreign 
Language Proficiency Pay (FLPP) up to six months prior to 
leaving the service. An individual was considered a 
success if both of these conditions were met. 

Descriptive statistics were first calculated to better 
understand the data population. Only 63% of the students 
this study was concerned with graduated from the DLIFLC. Of 
those individuals who graduated, only 45% were 
subsequently successful. These data were broken down further 
and statistical significance determined for success rates 
among different training pipelines, service, gender and 
contract lengths. Inferential statistics showed that there 
were statistically significant differences between 
services, gender, AFQT scores and contract lengths. The 
statistics showed that there is some interaction between 
service and contract length. In addition, it was 
discovered that the majority of observations for AFQT 
scores below 75 were missing AFQT data. Because of this 
fact, no meaning can be attributed to the findings 
concerning AFQT scores. 

The classification tree method was used to better 
understand the influence of all the independent variables 
looked at in this study. A classification tree was grown, 
cross-validated and pruned to produce a tree that did an 
adequate job of classifying observations. The tree also 
provided useful information about how to build the numeric 
independent variables into useful categorical variables. 
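
The way a tree split turns a numeric variable into a categorical one can be sketched as follows. This is a minimal illustration only: the data are invented, the function names are hypothetical, and it finds a single Gini-based split rather than growing, cross-validating and pruning a full tree as the study did.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best single split.

    Candidate thresholds are midpoints between adjacent distinct values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, score)
    return best


# Invented toy data: contract length in years and success (1) or failure (0).
contract_years = [3, 4, 4, 4, 5, 6, 6, 4, 5, 6]
success = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]

threshold, impurity = best_threshold(contract_years, success)
print(threshold, round(impurity, 2))
```

On this toy data the chosen threshold falls between four and five years, so the numeric variable could be recoded as "four years or fewer" versus "more than four years".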

Finally, logistic regression was used to further 
analyze the influence of all independent variables. After 
assessing the "goodness-of-fit" and adequacy of the 
different models produced, a final model was decided upon. 
This model provided further insight into which factors were 
most important in influencing post-DLIFLC success. 

This study found that training pipeline, service 
affiliation, contract lengths, citizenship, gender and AFQT 
scores were all common factors in predicting success. 
Though training pipelines had some minor influence, they 
were not as distinguishable as the other factors. Contract 
lengths were very influential in determining success. 
Individuals who had contract lengths of greater than four 
years were 0.08 to 0.56 times as likely to succeed as individuals 
who had contracts of four years or fewer. In terms of 
service affiliation, being in the Army had the most 
negative impact on success while the Air Force had the most 
positive effect, followed by the Marine Corps. Service 
affiliation is noteworthy in that the majority of students 
that pass through DLIFLC are in the Army or Air Force. 
Males were more likely to be successful than females. 
Males had 1.38 times greater odds of success than females. 
Though AFQT scores were found to be significant in 
explaining success, because of the fact that the majority 
of observations below the score of 75 were missing AFQT 
data, no conclusions can be drawn from the analysis of 
AFQT. The variable AFQT was left in the model because the 
model had better "goodness-of-fit" with the variable than 
without it. 
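
To make the odds figures above concrete, the sketch below converts an odds ratio into a change in probability. The baseline probability of 0.45 is a hypothetical value chosen for illustration (it is not a group-specific rate from this study), and the function name is an assumption.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability implied by multiplying the baseline odds by odds_ratio."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must be strictly between 0 and 1")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: a comparison group with a 45% chance of success.
baseline = 0.45

# Odds ratio of 1.38 (males relative to females, from the text).
print(round(apply_odds_ratio(baseline, 1.38), 3))

# Range of 0.08 to 0.56 reported for contracts longer than four years.
print(round(apply_odds_ratio(baseline, 0.08), 3),
      round(apply_odds_ratio(baseline, 0.56), 3))
```

An odds ratio multiplies odds, not probabilities, so the change in probability depends on the baseline that is assumed.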



INTRODUCTION 


The Defense Language Institute Foreign Language Center 
(DLIFLC) at the United States Army Presidio of Monterey, 
California trains over 3,000 personnel in 23 languages and 
several dialects annually. While trying to successfully 
complete training at the DLIFLC, a student may complete the 
program as originally assigned, recycle into the same 
language in a later class, relanguage into a different 
language (usually a language of lesser difficulty), drop 
from the program or require additional training after 
having taken the end-of-program Defense Language 
Proficiency Test (DLPT). There are a number of reasons 
classified by DLIFLC (academic, medical, other, etc.) that 
document a student's progression through the program 
assigned. The paths through the DLIFLC based on reason 
classifications are considered training pipelines for this 
study. 

This study analyzes the various training pipelines 
through the DLIFLC and other military, academic and 
personal characteristics to determine their effects on 
post-DLIFLC success of first-term enlistees. Success is 
defined as completion of initial enlistment contract 
obligation and maintenance of foreign language proficiency. 
Models are developed using regression classification trees 
and logistic regression techniques to better understand the 
factors that are related to post-DLIFLC success and to be 
able to adequately predict success. This information will 
assist the DLIFLC in beginning to address the issue of 
return on investment for each of the training pipelines. 
Additionally, it will allow the individual services to 
identify and intervene for those service members who are at 
greater risk for attrition and/or loss of linguistic 
proficiency. 


A. BACKGROUND 

1. Mission of DLIFLC 

The mission of the DLIFLC is to educate, sustain, 
evaluate and support foreign language specialists under 
guidelines of the Defense Foreign Language Program. The 
DLIFLC trains over 3,000 officer and enlisted members from 
the Army, Navy, Air Force and Marine Corps and select 
civilian and international personnel annually. Instruction 
is provided in 23 languages and several dialects through 31 
language departments and the Emerging Languages Task Force 
(ELTF) (www.dliflc.edu). All of these languages and 
dialects are subdivided into four difficulty categories. 
The categories are numbered from I to IV, with Category IV 
containing the languages most difficult for English 
speakers to learn. Each category is associated with a 
corresponding length of study for initial basic language 
training: Category I requires 25 weeks, Category II 34 
weeks, Category III 47 weeks and Category IV 63 weeks. 
Category IV languages include Arabic, Chinese, Korean and 
Japanese. 

The DLIFLC provides training at the basic, 
intermediate, advanced and specialized levels. The 
majority of students are enlisted and take the basic 
program of study during their first term of enlistment. 


2. DLIFLC Pre-Qualifications 


In order to qualify for language study, a student must 
successfully pass the Defense Language Aptitude Battery 
(DLAB). Prerequisites include a minimum score of 85 for 
Category I, 90 for Category II, 95 for Category III and 100 
for Category IV languages. Though these are minimum 
requirements, there are exceptions to these rules. There 
are various reasons for these exceptions (native speaker, 
service requirements, etc.). 
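
The category lengths and DLAB minimums stated above can be held in a small lookup, sketched here. The names are illustrative, and the check deliberately ignores the exceptions mentioned in the text (native speakers, service requirements and so on).

```python
# Values transcribed from the text; dictionary and function names are illustrative.
COURSE_WEEKS = {"I": 25, "II": 34, "III": 47, "IV": 63}
MIN_DLAB = {"I": 85, "II": 90, "III": 95, "IV": 100}
CATEGORIES = ("I", "II", "III", "IV")  # ordered from least to most difficult


def meets_dlab_minimum(dlab_score, category):
    """True if the score meets the stated minimum for the category."""
    return dlab_score >= MIN_DLAB[category]


def hardest_category_allowed(dlab_score):
    """Most difficult category whose minimum is met, or None if none is."""
    eligible = [c for c in CATEGORIES if meets_dlab_minimum(dlab_score, c)]
    return eligible[-1] if eligible else None


for score in (80, 92, 100):
    category = hardest_category_allowed(score)
    weeks = COURSE_WEEKS[category] if category else None
    print(score, category, weeks)
```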


3. Successful Completion of DLIFLC 

When students arrive at the DLIFLC they are assigned a 
program of study with the number of weeks of training 
corresponding to the category of their language. In order 
to successfully complete the DLIFLC program, a student must 
complete the program of study with at least the minimum 
grade point average. After completion of the course of 
study, the student must then take and pass the DLPT. The 
DLPT is divided into three sections consisting of 
listening, reading and speaking. The proficiency standards 
tested by the DLPT are based on the Interagency Language 
Roundtable (ILR) proficiency level descriptions 
(www.govtilr.org). Descriptions of these standards are 
provided in Table 1.1. 

A student must meet the requirement of 2/2/1+ on the 
DLPT. This requirement indicates a proficiency of level 2 
in listening and reading and a 1+ in speaking. A "+" 
indicates a proficiency above the base standard, but not at 
the next level. 
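
A minimal sketch of checking a score against the 2/2/1+ requirement is shown below. Representing a "+" as half a level is an assumption made only so that levels can be compared numerically; the function names are hypothetical.

```python
def ilr_value(level):
    """Map an ILR level string such as '1+' to a sortable number.

    Assumption for illustration: a '+' is treated as half a level.
    """
    level = level.strip()
    plus = level.endswith("+")
    base = int(level.rstrip("+"))
    return base + (0.5 if plus else 0.0)


def meets_dlpt_standard(listening, reading, speaking, minimum=("2", "2", "1+")):
    """True if every skill is at or above the corresponding minimum level."""
    scores = (listening, reading, speaking)
    return all(ilr_value(s) >= ilr_value(m) for s, m in zip(scores, minimum))


print(meets_dlpt_standard("2", "2", "1+"))   # exactly at the requirement
print(meets_dlpt_standard("2+", "2", "1"))   # speaking below 1+
```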
Table 1.1 Interagency Language Roundtable Proficiency Standards 

Level  Function/Tasks               Context            Accuracy
3      Support Opinions,            Practical,         Errors never interfere with
       Hypothesize, Explain,        Abstract,          communication and rarely
       Deal with Unfamiliar Topics  Special Interests  disturb the native speaker
2      Narrate, Describe,           Concrete,          Intelligible even if not used
       Give Directions              Real-World,        to dealing with non-native
                                    Factual            speaker
1      Q and A,                     Everyday,          Intelligible with effort and
       Create with the Language     Survival           practice
0      Memorized                    Random             Unintelligible

DLIFLC Command Briefing Slides (Anderson, 1997) 


Using data from the DLIFLC for fiscal years 1990 to 
present, the average rate of successful completion for all 
enlisted personnel in the basic course was 56%. Figure 1.1 
illustrates the successful completion rates over this 
period. It shows that there was a significant increase in 
completion rates: rates rose from 43% to 63%. The 
reason for the increase in completion rates is unknown. 
The data for this study included individuals who entered 
from 1997-2000. This period corresponds to the sharpest 
increase in completion rates. 
[Figure 1.1 plot: "Passed DLPT" series with reference lines for UCL, Average = .56 and LCL] 


Figure 1.1 Successful Completion Rates 1990-Present 

Compiled from data from DLIFLC. Confidence Intervals are 
for the overall average of 56%, but are not relevant. 


4. DLIFLC Training Pipelines 

Based on data gathered from the DLIFLC from 1990 to the 
present, only 45% of enlisted personnel in the basic program 
successfully completed DLIFLC in the originally allotted 
time. A further 11% of students successfully completed the 
DLIFLC program through various other means, which accounts 
for the overall completion rate of 56%. Some students 
recycled into a later class of the same language. A number 
of students relanguaged into a different language (normally 
a lower category language) in a later class. A few students 
required DLPT enhancement training after completing their 
program of study and failing to meet the minimum 
requirements on the DLPT the first time. Students can 
recycle, relanguage or take DLPT enhancement training 
multiple times and in numerous combinations. Each of the 
routes that lead to successful completion of the DLIFLC is 
considered a distinct training pipeline for this study. 


5. Defining Post-DLIFLC Success 

Most students who enter the DLIFLC basic program are 
junior enlisted personnel serving their first term of 
enlistment. Normally they have just completed recruit 
training and most have not completed any type of technical 
training. In addition, they have not been to their first 
operational unit. This study will look at the success or 
failure of junior enlisted (E-4 and below) personnel, who 
entered the DLIFLC between fiscal years 1997-2000 and 
successfully completed training and who are serving their 
first term of enlistment. 

Success for these individuals is defined as completing 
their first term of enlistment and maintaining their 
language proficiency. An individual will be deemed to have 
met his or her first-term obligation if he or she did not 
become a loss more than three months before the end of the 
obligation (based on the enlistment contract). This 
allowance is made because the services routinely use this 
period as a force-shaping tool, especially near the end of 
the fiscal year. Maintenance of language proficiency will be 
determined by whether or not an individual received Foreign 
Language Proficiency Pay (FLPP) until at least six months 
prior to the end of his or her first-term obligation. Six 
months was used because an individual must take the 
DLPT each year to continue to receive FLPP. We assume that 
many of the individuals who have already decided to leave 
at the end of their obligation may not believe the extra 
pay for six months or less is worth the time and effort to 
pass the DLPT. Up to the six-month point it is assumed 
that the extra pay is enough of an incentive to continue to 
study for and pass the DLPT (FLPP at the maximum of $200 
per month for an E-3 is approximately 15% of pay after 
taxes). 
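
The two-part success definition can be sketched as a small function. This is one reading of the definition, not the study's actual code: the argument names are hypothetical, "received FLPP at least six months prior" is interpreted as the last FLPP payment falling no earlier than six months before the contract end, and month arithmetic clamps the day of the month to 28 for simplicity.

```python
from datetime import date


def months_before(d, months):
    """Date `months` calendar months before `d`, with the day clamped to 28."""
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def is_success(contract_end, loss_date, last_flpp_date):
    """Apply the two-part success definition to one individual.

    contract_end:   end of the first-term obligation
    loss_date:      date the individual left the service, or None if no loss
    last_flpp_date: last date FLPP was received, or None if never received
    """
    completed = loss_date is None or loss_date >= months_before(contract_end, 3)
    maintained = (last_flpp_date is not None
                  and last_flpp_date >= months_before(contract_end, 6))
    return completed and maintained


end = date(2004, 6, 30)
print(is_success(end, None, date(2004, 3, 1)))               # both conditions met
print(is_success(end, date(2003, 1, 15), date(2003, 1, 1)))  # early loss
print(is_success(end, date(2004, 5, 1), date(2003, 6, 1)))   # FLPP lapsed
```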


B. THE PROBLEM 

Previous studies conducted at the DLIFLC have 
concentrated on attrition at the DLIFLC. To date there 
have been no studies that have linked an individual's 
performance at the DLIFLC and his or her success after 
leaving the DLIFLC. This study tries to bridge that gap 
and provide valuable information to both the DLIFLC and the 
individual services. 

Students who do not successfully complete DLIFLC as 
originally assigned are costly to the organization and to 
individual services. The DLIFLC budgets for a certain 
number of students in each language for each fiscal year. 
When students are not able to complete this training as 
assigned, they are either dropped from the program or are 
assigned another training pipeline that may not have been 
properly budgeted for. The DLIFLC is not reimbursed for 
such expense. In addition, the services lose valuable time 
and resources when individuals do not graduate on time. 
Most have to have follow-on schools rescheduled causing 
further delay in reporting to their first operational unit 
where the service first sees a return on the large 
investment in this individual. This study will attempt to 
determine whether there is a difference in the post-DLIFLC 
success rates (as defined above) among the various training 
pipelines at the DLIFLC and other military, academic and 
personal factors. Identifying factors influential on 
success and developing an accurate prediction model will 
enable the DLIFLC to begin to address the return on 
investment for each pipeline and allow the services to 
identify those individuals who are at higher risk for 
attrition and/or not maintaining their language 
proficiency. 

Using data from the DLIFLC and the Defense Manpower 
Data Center (DMDC) in Seaside, CA, this study looks at 
junior enlisted personnel in their first term of enlistment 
who entered the DLIFLC in fiscal years 1997-2000 and 
successfully completed DLIFLC training. Training pipelines 
are defined and post-DLIFLC success is established. 
Inferential statistics will be developed to determine if 
there is a significant difference in post-DLIFLC success 
rates among training pipelines and other factors. The 
training pipeline and other variables deemed important 
and/or identified in previous studies as important in 
service attrition are considered in classification trees 
and logistic regression analysis to develop a prediction 
model. 


C. ORGANIZATION OF THESIS 

This thesis is organized into five chapters. Chapter 
II consists of reviews of literature concerning attrition 
studies conducted at the DLIFLC and within the individual 
services. Chapter III describes the data being used and 
the descriptive statistics developed. Chapter IV is a 
description of the methodology, analysis and results. 
Chapter V contains the conclusions and recommendations for 
further research. 

II. LITERATURE REVIEW 


Most research conducted previously at the DLIFLC has 
looked at the factors that influence attrition while at the 
DLIFLC. These studies do not provide any support for this 
study except to demonstrate the link between performance on 
the DLAB and success at the DLIFLC. 

Attrition studies employed by the services provide 
some information in terms of the factors that have been 
proven to be significant in first-term service attrition. 
These factors will be important to this study. 

A. LANGUAGE SKILL CHANGE PROJECT 

The Language Skill Change Project (LSCP) was a study 
conducted by the DLIFLC Research and Analysis Division and 
the U.S. Army Research Institute for the Behavioral and 
Social Sciences (ARI). This was a longitudinal study begun 
in 1987 that tracked a number of Army students through 
DLIFLC training and the individual's initial tour of duty. 
There were various objectives to this study; the objective 
important to this study was in identifying predictors of 
success for language learning at DLIFLC. 

1. The Prediction of Language Learning Success at 
DLIFLC 

LSCP Report II, "The Prediction of Language Learning 
Success at DLIFLC," analyzes factors that are related to 
success at the DLIFLC. Success is defined as completing 
the course of study satisfactorily and meeting the minimum 
requirements on the DLPT. The study found a number of 
factors that were important in predicting success. Most 
important to the present study is that language aptitude, 
as measured by the DLAB, was a significant factor in 
predicting success. (O'Mara et al., 1990) 

2. Training Approaches for Reducing Student 
Attrition from Foreign Language Training 

LSCP Report III, "Training Approaches for Reducing 
Student Attrition from Foreign Language Training," analyzed 
potentially modifiable factors in addressing academic 
attrition. This study confirmed LSCP Report II's 
conclusion that the DLAB was a significant factor in 
predicting success at DLIFLC. A DLAB score of 100 appeared 
to be a critical value in determining success. (O'Mara et 
al., 1994) 

B. OTHER DLIFLC ATTRITION STUDIES 

There have been a number of other studies concerning 
DLIFLC attrition. Two Naval Postgraduate School (NPS) 
students separately conducted thesis research on this 
subject. Robert E. Anderson carried out research in 1997 
entitled, "Study of Initial Entry Student Attrition from 
Defense Language Institute Foreign Language Center." This 
study looked at data from fiscal years 1994-1996 and 
analyzed factors relating to success at the DLIFLC. A 
binary tree classification method was used to identify the 
best set of predictors. As with LSCP Reports II and III, 
this study found that the DLAB was a significant predictor 
of success. (Anderson, 1997) Additionally, Chin Han Wong 
performed a study in 2004 called, "An Analysis of Factors 
Predicting Graduation of Students at Defense Language 
Institute Foreign Language Center." Wong used logistic 
regression techniques to examine factors affecting success 
at DLIFLC. His research confirmed the finding of the other 
studies that the DLAB is a reliable predictor of success at 
DLIFLC. (Wong, 2004) 

C. SERVICE ATTRITION STUDIES 

Since the inception of the all-volunteer force, the 
military has been concerned with first-term attrition. 
Attrition has traditionally been defined as not completing 
contractual obligation of enlistment. Over the last three 
and a half decades attrition has hovered around 30%. This 
number has fluctuated to as high as 40%, but is normally 
very close to 30%. This attrition implies a huge cost to 
the military. Recruiting and training are very expensive; 
it is in the services' best interests to be able to 
accurately predict attrition in order to modify recruiting 
techniques and to identify and intervene for individuals 
with higher risk of attrition. To this end, there has been 
a large volume of research connected with military 
attrition. This is important in that this research 
identifies factors that might be important in predicting 
post-DLIFLC success in the present study. 
1. Determining Characteristic Groups to Predict Army 
Attrition 

"Determining Characteristic Groups to Predict Army 
Attrition" was a study conducted in 1999 to identify 
factors that would aid in predicting attrition for the 
Army. The Army's Enlisted Loss Inventory Model (ELIM) had 
not been considered satisfactory; this caused the Office of 
the Deputy Chief of Staff, Personnel (ODCSPER) to consider 
other alternatives to the ELIM. The study used 
Classification and Regression Tree techniques to analyze 
the factors predicting attrition and develop improved 
c-groups. The new c-groups were able to outperform the old 
in terms of misclassification rates. This study found that 
gender was the most important factor. Other variables that 
were found important were race, length of service 
obligation, Armed Forces Qualification Test (AFQT) scores 
and level of education. (Buttrey and Larson, 1999) 


2. Analysis of Early Military Attrition Behavior 

"Analysis of Early Military Attrition Behavior" was a 
study conducted in 1984 by RAND. This study sought to 
incorporate both military personnel record data and data 
from the 1979 Survey of Personnel Entering Military 
Service. This study revealed that high school graduation 
status, age and pre-enlistment work history were all 
significant factors in predicting early attrition. In 
particular, non-high school graduation was the single best 
predictor for early attrition. (Buddin, 1984) 
3. What Characterizes Successful Enlistees in the 
All-Volunteer Force: A Study of Male Recruits in 
the Navy 

"What Characterizes Successful Enlistees in the All- 
Volunteer Force: A Study of Male Recruits in the Navy" was 
a study conducted in 1992 to ascertain the factors that 
affect attrition among male service members in the Navy. 
An analysis using logistic regression techniques was 
utilized to determine the factors that were important in 
influencing attrition. This study found that high school 
graduation status, Delayed Entry Program (DEP) time, race 
and AFQT scores were all important factors. (Cooke & 
Quester, 1992) 

D. SUMMARY 

Though there have been no studies that have linked the 
DLIFLC experience and post-DLIFLC success, there has been 
a great deal of research on DLIFLC attrition and service 
attrition that has provided valuable insight into factors 
affecting attrition. Attrition studies at the DLIFLC 
have convincingly shown that the DLAB is a significant 
factor in determining DLIFLC success. Though other 
factors considered were also found to be important 
predictors, the DLAB is critical in that it can be 
assumed to be a reliable cognitive screen for language 
aptitude. Service attrition studies have yielded factors 
that are consistent in predicting attrition. The factors 
gender, age, race, length of service contract, education 
level and AFQT scores were all significant predictors of 
attrition and are important in beginning to understand 
post-DLIFLC success. 

III. DATA AND DESCRIPTIVE STATISTICS 


A. DATA 

The DLIFLC maintains a database of student information 
dating back to 1990. There are over 50,000 entries in this 
database. The data contain various personal and 
professional statistics on each student (SSN, Date of 
Birth, Language, Start Date, DLPT scores, etc.). The DMDC 
maintains multiple databases on all Armed Forces personnel. 
These databases contain personal and professional 
statistics on each service member (SSN, Date of Birth, Term 
of Service, Total Active Federal Military Service, Foreign 
Language Proficiency Pay, etc.). The data for this study 
was obtained from the DLIFLC and then merged with data from 
the DMDC corresponding to each individual. 

1. DLIFLC Data 

The majority of training at the DLIFLC involves first- 
term enlistees (less than 1-2 years of service/E-4 and 
below) who are enrolled in the basic program of instruction 
in the foreign language to which they have been assigned. 
This study was concerned with individuals who began 
instruction during fiscal years 1997-2000. 

The data obtained from the DLIFLC contained all 
entries for the basic course of study since 1990. It 
should be noted that a student could have multiple entries 
in the database due to recycling, relanguaging and 
DLPT-enhancement training. Once the data was sorted for entry 
date and for junior enlisted (E-4 and below) status and 
multiple entries collapsed to a single entry there were 
6,162 distinct observations. These observations were then 
sorted based on successful completion of DLIFLC training. 
This yielded 3,868 observations (63% completion rate). 

Once the data were pared to 3,868 observations, they 
were examined to determine the multiple training pipelines 
that the students utilized to successfully complete DLIFLC. 
There were 56 distinct training pipelines used by these 
students. Table 3.1 describes each of these pipelines. 
Table 3.2 shows definitions for the codes in Table 3.1. 


Table 3.1 Graduate Training Pipelines 

TP    #     E1      E1    E2      E2    E3      E3    E4      E4    E5      E5    FP
            OC      Code  OC      Code  OC      Code  OC      Code  OC      Code
P1    2959  Grad    *                                                             P1
P2    32    D-Grad  *                                                             Drop
P3    6     RC      A/Z   D-Grad  *                                               Drop
P4    319   RC      A     Grad    *                                               P2
P5    65    RC      Z     Grad    *                                               P3
P6    102   RC      J     Grad    *                                               P4
P7    7     RC      H     Grad    *                                               P3
P8    104   RL      A     Grad    *                                               P5
P9    15    RL      Z     Grad    *                                               P6
P10   11    RL      J     Grad    *                                               P6
P11   3     RL      V     Grad    *                                               P6
P12   5     RC      A     RC      A     Grad    *                                 Drop
P13   4     RC      Z     RC      Z     Grad    *                                 Drop
P14   3     RC      J     RC      J     Grad    *                                 Drop
P15   2     RC      A     RC      Z     Grad    *                                 Drop
P16   6     RC      A     RC      J     Grad    *                                 Drop
P17   3     RC      Z     RC      A     Grad    *                                 Drop
P18   2     RC      Z     RC      J     Grad    *                                 Drop
P19   1     RC      J     RC      A     Grad    *                                 Drop
P20   4     RC      A     RL      A     Grad    *                                 Drop
P21   1     RC      A     RL      Z     Grad    *                                 Drop
P22   1     RC      Z     RL      A     Grad    *                                 Drop
P23   1     RC      Z     RL      J     Grad    *                                 Drop
P24   2     RC      J     RL      J     Grad    *                                 Drop
P25   1     RC      J     RL      Z     Grad    *                                 Drop
P26   2     RL      A     RL      V     Grad    *                                 Drop
P27   1     RL      J     RL      V     Grad    *                                 P6
P28   3     RL      A     RC      A     Grad    *                                 Drop
P29   5     RL      A     RC      J     Grad    *                                 Drop
P30   2     RL      J     RC      A     Grad    *                                 Drop
P31   1     RC      A     RC      J     RL      J     Grad    *                   Drop
P32   1     RC      J     RC      J     RL      J     Grad    *                   Drop
P33   1     RC      A     RL      V     RL      Z     Grad    *                   Drop
P34   1     RC      A     RL      A     RC      J     Grad    *                   Drop
P35   1     RC      J     RL      J     RC      Z     RC      J     Grad    *     Drop
P36   9     Pass    41    Grad    *                                               Drop
P37   64    Fail    41    Grad    *                                               P7
P37A  65    Fail    41    Grad    *                                               IE
P38   18    RC      A     Fail    41    Grad    *                                 P8
P39   3     RC      Z     Fail    41    Grad    *                                 Drop
P40   6     RC      J     Fail    41    Grad    *                                 Drop
P41   2     RC      A     Pass    41    Grad    *                                 Drop
P42   1     RC      J     Pass    41    Grad    *                                 Drop
P43   3     Fail    41    Pass    41    Grad    *                                 P7
P44   1     Pass    41    Pass          RC      Z     Grad    *                   Drop
P45   1     Fail    41    Pass    V                                               Drop
P46   1     RL      A     Pass    41    Grad    *                                 Drop
P47   8     RL      A     Fail    41    Grad    *                                 Drop
P48   1     RL      Z     Fail    41    Grad    *                                 Drop
P49   1     RL      J     Fail    41    Grad    *                                 Drop
P50   1     Fail    41    Pass    41    Grad    *                                 Drop
P51   1     Fail    41    Pass    V                                               Drop
P52   1     RC      A     Fail    41    Pass    41    Grad    *                   Drop
P53   1     RC      A     Fail    41    Grad    *                                 Drop

J     Medical 
V     Personnel Action Pending 
Z     Other - Not Defined 
41    Post DLPT-Enhancement Training 
*     End of Input for Observation 
IE    Input Error (Dropped from study) 
Drop  Observations Dropped from Study 


The majority of the pipelines identified had fewer than 10 
observations in them. No noteworthy analysis could have 
been conducted on those numbers. In consultation with the 
Research and Analysis Division at the DLIFLC, these 56 
pipelines were collapsed into 8 meaningful pipelines. 
Table 3.3 gives a description of these pipelines. These 8 
pipelines contain 3,693 observations. This is 95% of the 
observations for this study. A total of 175 observations 
were dropped from consideration for this study due to the 
collapsing of pipelines. 


Table 3.3 DLIFLC Training Pipelines

Pipeline  Total  Description
P1        2455   On-time completion.
P2        278    Recycled once due to academic difficulty.
P3        56     Recycled once due to other/undisclosed reason.
P4        94     Recycled once due to medical reasons.
P5        86     Relanguaged once due to academic difficulty.
P6        25     Relanguaged once due to other/undisclosed/medical.
P7        54     DLPT enhancement training required.
P8        17     Recycled once and DLPT enhancement training required.


The 3,693 observations in 8 established pipelines were 
sorted again based on number of years of service upon 
entering the DLIFLC. Those students who had more than two 
years of service (328 observations) were dropped from the 
study. This was to ensure that only first-term enlistees 
were considered for the present study. This resulted in 
3,365 observations remaining for consideration. 


2. DMDC Data 

After the data provided by the DLIFLC had been sorted, 
the DMDC was requested to provide data from their databases 
for each of the 3,365 students. The data returned from the 
DMDC contained 3,253 observations. The DMDC did not have 
records for 112 students. Additionally, there were 188 
students for whom critical data were missing and who 
subsequently had to be dropped from this study. The 
remaining 3,065 observations were then sorted by Total 
Active Federal Military Service (TAFMS), LOSS DATE, TERM 
(length of contractual obligation of enlistment) and 
Foreign Language Proficiency Pay (FLPP) to determine the 
post-DLIFLC success of each individual. 

a. TAFMS, LOSS DATE and TERM 

TAFMS is a variable that indicates the total 
military service that an individual has completed. It is 
reported in number of months. TERM refers to the 
enlistment obligation that a member has contractually 
agreed to. This number was reported in number of years. 
It was converted to months and then compared to TAFMS. 
LOSS DATE specifies the date that an individual left active 
military service. Individuals who had TAFMS less than TERM 
minus three months and had a LOSS DATE were considered to 
have attrited from the service. Those individuals that did 
not have a LOSS DATE were reviewed individually to 
determine their attrition status. The data revealed that 
1,439 individuals did not complete their contractual 
obligation. An ATTRITION variable was created and set to 
"Yes" for these individuals. Of those who successfully 
completed DLIFLC training, 47% did not finish their 
first-term obligation. 
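
The attrition rule described above can be sketched in a few lines. This is only an illustration of the stated rule, not the procedure actually run for the study; the function name `attrition_status` and the "Review" return value (for records without a LOSS DATE, which were examined individually) are assumptions made for this sketch.

```python
def attrition_status(tafms_months, term_years, has_loss_date):
    """Apply the stated rule: TAFMS < TERM - 3 months, with a LOSS DATE."""
    term_months = term_years * 12  # TERM is reported in years
    if not has_loss_date:
        # Records without a LOSS DATE were reviewed individually.
        return "Review"
    if tafms_months < term_months - 3:
        return "Yes"
    return "No"


# A four-year contract is 48 months, so the cutoff is 45 months of service.
print(attrition_status(30, 4, True))   # left well before the cutoff
print(attrition_status(46, 4, True))   # served past the cutoff
print(attrition_status(20, 5, False))  # no LOSS DATE on file
```

The three-month allowance means that a member who separates slightly before the contractual end date is still counted as having completed the obligation.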


b. FLPP 

FLPP is a variable that was determined by 
reviewing when the last payment for foreign language 
proficiency was received for each individual. If a payment 
was received fewer than six months prior to the service 
member completing his or her obligated service, then FLPP 
was set to "Yes." The six-month point was assumed to be the 
point at which most individuals would lose motivation for 
preparing for and/or taking the DLPT in order to keep 
receiving FLPP. Of the 3,065 observations, 234 did not 
meet this requirement. This equates to 8% of those who 
successfully completed DLIFLC training. 



c. Success 

SUCCESS was a variable created from the ATTRITION 
and FLPP variables. SUCCESS is the dependent variable for 
this study. If an individual did not attrit and received 
FLPP for the required amount of time, he or she was deemed 
successful and SUCCESS was flagged with a "Yes." There were 
1,392 individuals that were considered a success. Out of 
3,065 individuals, only 45% were successful once they left 
the DLIFLC. 
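
As a check on the arithmetic, the SUCCESS flag and the three outcome shares can be reproduced from the counts reported in this section. The sketch below is illustrative only; the function name `success_flag` and the dictionary layout are assumptions, while the counts (1,439, 234 and 1,392) come from the text.

```python
def success_flag(attrition, flpp):
    """SUCCESS is "Yes" only when the member did not attrit and kept FLPP."""
    return "Yes" if attrition == "No" and flpp == "Yes" else "No"


# Counts reported in the text for the 3,065 observations.
counts = {"Attrition": 1439, "No FLPP": 234, "Success": 1392}
total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}

print(total)   # number of observations
print(shares)  # percentage of observations in each outcome
```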


B. VARIABLES 

1. Dependent Variable 

The dependent variable for this study was SUCCESS. 
The possible outcomes are successful and not successful. 
These outcomes are reflected in the variable SUCCESS as 
either "Yes" or "No." Table 3.4 summarizes the dependent 
variable. 


Table 3.4 Dependent Variable Description 

Name     Symbol   Classification  Description
Success  SUCCESS  Categorical     Yes, No

2. Independent Variables 



The independent variables used are summarized in Table 
3.5. All of the variables in this table are categorical, 
and together they form the final set used in the logistic 
regression (Chapter IV). They were determined by 
inspection of the data and by analyzing the results of the 
classification tree (Chapter IV). Variables that have been 
transformed from their original state are TERM.C, AFQT, 
DLAB and LANG. These transformations will be discussed in 
a later chapter. 


Table 3.5 Independent Variable Descriptions 

Name                         Symbol      Classification  Description
Training Pipeline            TRAIN.PIPE  Categorical     P1, P2, P3, P4, P5, P6, P7, P8
Gender                       SEX         Categorical     M (Male)
                                                         F (Female)
Citizenship                  CITIZ       Categorical     C (Citizen)
                                                         N (Non-citizen)
                                                         UK (Unknown)
Marital Status               MARRY       Categorical     S (Single)
(at end of service)                                      M (Married)
                                                         UK (Unknown)
Ethnicity                    RETH        Categorical     1 (White)
                                                         2 (Black)
                                                         3 (Hispanic)
                                                         5 (Asian)
                                                         UK (Unknown)
Term of Initial Enlistment   TERM.C      Categorical     4, 5, 6
Education Level              EDUC        Categorical     1 (Less than HS)
(at entry into service)                                  2 (HS Diploma)
                                                         3 (HS Equivalency)
                                                         4 (Occupational Prgm)
                                                         5 (Attendance Cert OP)
                                                         6 (Attendance Cert HS)
                                                         7 (Correspond Cert)
                                                         8 (College-1 Semester)
                                                         9 (Alternate Training)
                                                         10 (Unknown)
Service                      SVC         Categorical     N (Navy)
                                                         M (Marine Corps)
                                                         F (Air Force)
                                                         A (Army)
Armed Forces Qualification   AFQT        Categorical     A (less than 75)
Test Score                                               B (75-90)
                                                         C (91-99)
Defense Language Aptitude    DLAB        Categorical     A (90 and below)
Battery Score                                            B (91-95)
                                                         C (96-100)
                                                         D (above 100)
Language                     LANG        Categorical     I (Category I Lang)
                                                         II (Category II Lang)
                                                         III (Category III Lang)
                                                         IV (Category IV Lang)
Native of English            NATIV.E     Categorical     Yes, No
Native of Other Language     NATIV.O     Categorical     Yes, No

C. DESCRIPTIVE STATISTICS 

Tables 3.6 - 3.18 provide summaries of the descriptive 
statistics generated. Of particular interest are the 
success rates by training pipeline, service, gender and 
contract length, and the retention rates for each of the 
services. Noteworthy numbers are in bold. 

Table 3.6 Observations 

                   Total  % of Total
Observations Used  3065   79%


Table 3.7 Service Total 

              Total  % Total
Observations  3065   100%
Army          1307   43%
Air Force     978    32%
Navy          409    13%
Marine Corps  371    12%


Table 3.8 Success 

                   Total  % of Total
Observations Used  3065   100%
Attrition          1439   47%
No FLPP            234    8%
Success            1392   45%



Table 3.11 Pipeline Success 

Pipeline  Total  Success  % Success
P6        25     12       48%
P7        54     19       35%
P8        17     7        41%


Table 3.12 Gender Success 

        Total  Success  % Success
Male    1894   912      48%
Female  1171   480      41%

Table 3.13 Service Success by Pipeline 

                Total  Success  % Success
Army:
  P1            1011   427      42%
  P2-P8         296    132      45%
Air Force:
  P1            805    337      42%
  P2-P8         173    60       34%
Navy:
  P1            336    205      61%
  P2-P8         73     35       48%
Marine Corps:
  P1            303    154      51%
  P2-P8         68     42       62%


Table 3.14 AFQT Success 

AFQT Score    Success  Failure  % Success
Less than 75  428      295      59%
75-90         433      642      40%
91-99         531      736      42%


Table 3.15 Language Category Success Rates 

Language Category  Success  Failure  % Success
I                  259      300      46%
II                 2        1        67%
III                476      532      47%
IV                 655      840      44%


Table 3.16 Enlistment Contract Length by Service 

              4 Yrs    % 4 Yrs
Service       or Less  or Less  5 Yrs  % 5 Yrs  6 Yrs  % 6 Yrs
Army          293      22%      913    70%      101    8%
Air Force     301      31%      72     7%       605    62%
Navy          327      80%      35     9%       47     11%
Marine Corps  74       20%      292    79%      5      1%





Table 3.17 Success Rates by Service by Length of Contract 

Contract         Army        Air Force   Navy        Marine Corps
Length           S     F     S     F     S     F     S     F
4 Yrs or Less    170   123   200   101   198   129   59    15
% Success        58%         66%         61%         80%
5 Years          375   538   63    9     25    10    135   157
% Success        41%         88%         71%         46%
6 Years          14    87    134   471   17    30    2     3
% Success        14%         22%         36%         40%


Table 3.18 Service Retention Rates 

              Total  Retained  % Retained
Army          1307   414       32%
Air Force     978    273       28%
Navy          409    235       57%
Marine Corps  371    87        23%


1. Training Pipelines 

This study set out to determine whether there were 
significant differences in the success rates of individuals 
in each training pipeline. The first and most interesting 
statistic generated concerned training pipeline P1, which 
was considered the base pipeline. This pipeline consisted 
of those individuals who successfully completed their 
course work and passed the DLPT as originally assigned (on 
time). There were 2,455 individuals in pipeline P1. Only 
1,123, or 46%, were subsequently successful. There were no 
statistically significant differences (0.05 level) between 
P1's success rate and those of the other pipelines. 
However, because of the low number of observations in the 
other pipelines, the tests have little power and it is 
difficult to draw firm conclusions from this finding. 
Table 3.19 gives the z-statistic from the two-sample test 
of proportions and the associated p-values for the 
different pipelines. It is interesting to note that the 
success rate for pipeline P2 (recycle once due to academic 
trouble) is essentially the same as P1 (46%), but the 
success rate for pipeline P5 (relanguage once due to 
academic trouble) is noticeably higher (53%). Though this 
difference was not statistically significant (p = 0.20), 
the spread between the success rates appears large enough 
to believe that a difference may actually exist. This 
assertion is made because the low number of observations in 
the pipeline weakens the statistical test. A possible 
explanation is that those individuals that relanguage once 
are more motivated by (satisfied with) their new language 
(presuming no academic difficulty in the second language) 
and may therefore have been more satisfied with their 
career after leaving the DLIFLC. Those individuals who 
recycled once into the same language in which they had 
academic difficulty may not have been as satisfied in their 
career due to this difficulty. 


Table 3.19 Training Pipeline Statistical Inference 

Pipeline  % Success  z-statistic  p-value
P1        46%
P2        46%        0            0.9602
P3        36%        1.49         0.1362
P4        40%        1.15         0.2502
P5        53%        1.28         0.2006
P7        35%        1.61         0.1070

Due to the low number of observations for P6 and P8, no 
inferential test was conducted. 
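
For readers who wish to retrace the two-sample test of proportions, a minimal sketch follows. It assumes the pooled-variance form of the test and a two-sided normal p-value; the study does not state which variant was used, so the result should land close to, but not necessarily exactly on, the tabled value. The function name `two_proportion_z` is hypothetical. The counts are those reported for P1 (1,123 of 2,455) and P7 (19 of 54, Table 3.11).

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-sample z-test for a difference in proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p = two_proportion_z(1123, 2455, 19, 54)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With only 54 observations in P7, the standard error is dominated by the smaller group, which is why a difference of roughly ten percentage points still fails to reach the 0.05 level.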


2. Service Breakdown 

There is a statistically significant difference among 
the services in regard to success rates. The success rate 
for the Navy (59%) is considerably higher than those of the 
other services. The Marine Corps (53%) has an appreciably 
higher rate than the Army (43%) and the Air Force (41%). 
The Air Force has the lowest success rate. Using the Navy 
as the baseline, the differences in success rates are 
statistically significant for the Army and Air Force at the 
0.05 level and for the Marine Corps at the 0.10 level. 
This suggests that service affiliation may be associated 
with success. Table 3.20 gives the z-statistic and 
p-values for the success rates. These results are 
especially important for the Army and the Air Force, 
because these two services account for 75% of all 
observations. 


Table 3.20 Service Success Rates 

Service       % Success  z-statistic  p-value
Navy          59%
Army          43%        5.66         0.00*
Air Force     41%        6.13         0.00*
Marine Corps  53%        1.69         0.09**

* Significant at all levels. ** Significant at the 0.10 level. 


3. Contract Lengths and Service Success Rates 

When analyzing success rates by contract length, it is 
immediately apparent that those individuals with contract 
lengths of five and six years have a considerably lower 
success rate (37%) than those with contract lengths of four 
years or less (63%). Dissecting this data further reveals 
that the majority of Navy contracts are four years or less, 
the majority of Marine Corps and Army contracts are five 
years, and the majority of Air Force contracts are six 
years. For the three non-Navy services, these majority 
groups account for some of the lowest success rates within 
each service (Marine Corps 46%, Army 41%, Air Force 22%). 
All services have success rates above 50% for individuals 
with contracts of four years or less and all services have 
success rates equal to or below 40% for individuals with 
six-year contracts. The highest success rate was for the 
Air Force (88%) for five-year contracts. It appears that 
contract lengths are associated with success rates and that 
there might be some interaction between service and 
contract length. Figure 3.1 graphically displays the 
success rates by service by contract length. 

Figure 3.1 Success Rates by Service by Enlistment Length 


4. Gender 

There were 1,894 males within this study. Of those, 
only 912, or 48%, were successful once they left DLIFLC. 
There were 1,171 females within the study; only 480, or 
41%, were subsequently successful after DLIFLC. This 
difference of 7 percentage points suggests that gender may 
be associated with success. Table 3.21 gives the 
z-statistic and p-value for success rates using males as 
the baseline. 


Table 3.21 Gender Success Rates 

        % Success  z-statistic  p-value
Male    48%
Female  41%        3.78         0.00*

* Significant at all levels. 

5. AFQT 

AFQT scores reveal an intriguing finding. Contrary to 
what other studies have found, this study found that lower 
AFQT scores are related to success. Those individuals with 
AFQT scores below 75 had the highest success rates (59% 
versus 40% for scores of 75-90 and 42% for scores of 
91-99). The differences in success rates are statistically 
significant. Table 3.22 gives the z-statistic and p-value 
for the success rates. Looking more closely at the data 
reveals that the AFQT scores below 75 range from 0 to 74, 
and over half the scores in this range are 0. A score of 0 
indicates that the AFQT score is missing for that 
observation. Because of this, it is difficult to attribute 
any meaning to the finding above. 


Table 3.22 AFQT Success Rates 

AFQT Score    % Success  z-statistic  p-value
Less than 75  59%
75-90         40%        8.12         0.00*
91-99         42%        7.33         0.00*

* Significant at all levels. 


6. Retention 

Though retention rates are not the emphasis of this 
study, the statistics are easily derived and important for 
all services. The statistics are critical because the more 
people who remain in service past their obligated service, 
the less money has to be spent on recruiting and training 
people to replace them. The Marine Corps (23%), Air Force 
(28%) and Army (32%) all had retention rates noticeably 
lower than that of the Navy (57%). By inspection, it can 
be seen that these retention rates are considerably 
different. Of particular importance in this discussion is 
the fact that the majority of enlistment contracts for the 
Navy were four years, yet the Navy is still getting five or 
more years of service on average out of these individuals. 


D. SUMMARY 

Though the differences in success rates among the 
pipelines were not found to be statistically significant, 
the pipelines may still have some effect on success. The 
low numbers of observations in most pipelines limit the 
power of the inferential statistics. Service affiliation, 
gender, AFQT and contract length all appear to have an 
effect on success rates. Previous studies have shown these 
factors to have effects on service attrition, although this 
study shows the relationship between AFQT scores and 
success to be the opposite of what previous studies 
indicate. The following chapter will describe the analysis 
of these and other factors and the conclusions drawn from 
them. 


IV. METHODOLOGY, ANALYSIS AND RESULTS 


The results for the classification tree and logistic 
regression analyses for post-DLIFLC success are contained 
in this chapter. The methodology and evaluation of results 
are described in the following sections. 

A. CLASSIFICATION TREE 

1. Methodology 

A classification tree is a statistical method used to 
predict the state to which an observation is most likely to 
belong. The method is termed a "tree" because the graph 
appears as an upside-down tree. The first node in the tree 
is the "root node." It is split into two nodes, each of 
which may in turn be split into two more, giving up to four 
nodes at the next level. Each split is made on one 
independent variable. This process continues until 
predetermined limits of splitting are achieved. The 
general procedure for classification trees is that, at 
every opportunity to split, the split that maximizes the 
node purity is used. The algorithm accomplishes this by 
searching through all independent variables, evaluating all 
possible splits and choosing the split that minimizes the 
deviance (minus twice the log-likelihood) of the resulting 
nodes (Montgomery et al., 2001). A large data set with 
many predictor variables would develop into a very complex 
tree. Although this is the tree with the maximized node 
purities, it is not the optimal design. The optimal tree 
is determined through cross-validation and pruning. 
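
The deviance criterion can be illustrated with a short sketch. It is not the S-Plus implementation; the function names `node_deviance` and `split_reduction` are assumptions, and the example split (contracts of four years or less versus longer contracts) uses success and failure counts summed across services from Table 3.17.

```python
import math


def node_deviance(successes, failures):
    """Binomial deviance of a node: -2 times its maximized log-likelihood."""
    n = successes + failures
    deviance = 0.0
    for count in (successes, failures):
        if count > 0:  # an empty class contributes nothing
            deviance -= 2 * count * math.log(count / n)
    return deviance


def split_reduction(left, right):
    """Drop in deviance when a parent node is split into two children."""
    parent = (left[0] + right[0], left[1] + right[1])
    children = node_deviance(*left) + node_deviance(*right)
    return node_deviance(*parent) - children


# (successes, failures) summed over the four services in Table 3.17.
four_or_less = (627, 368)
five_or_six = (765, 1305)
print(round(split_reduction(four_or_less, five_or_six), 1))
```

A tree-growing algorithm would compute this reduction for every candidate split of every variable and keep the largest; a pure node (all successes or all failures) has a deviance of zero.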





Cross-validation is a method that optimizes both the 
purity of the tree and its ability to accurately classify 
new data. One method of cross-validation is to partition 
the data into ten nearly equal size sets. Each set is 
withheld in turn and the remaining nine sets are used to 
grow a tree. A tree is grown to its maximum size and then 
pruned back to the root node. The tree is pruned in a 
manner that tries to maximize the purity at the new number 
of splits. The minimum deviance of each size tree for the 
ten trials is found. Next, an evaluation of the penalized 
deviance, a weighted sum of the minimum deviance and the 
number of leaves in the tree, is conducted. There is a 
point in growing trees where the size of a tree is so large 
that it loses its predictive power and the penalized 
deviance begins to increase. Cross-validation plots give 
an idea as to the optimal size of a tree to minimize the 
penalized deviance and to allow for proper pruning. 
Pruning limits the size of the tree. The optimal size for 
the tree determined from cross-validation is used to grow 
the new tree. 
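
Two pieces of this procedure lend themselves to a brief sketch: partitioning the observations into ten nearly equal sets, and the penalized deviance used to compare tree sizes. The code below is a simplified illustration, not the S-Plus routine. The names `kfold_indices`, `penalized_deviance` and the weight `alpha` are assumptions, and in practice the observations would be shuffled before being partitioned.

```python
def kfold_indices(n, k=10):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most 1."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(range(start, start + size))
        start += size
    return folds


def penalized_deviance(deviance, n_leaves, alpha):
    """Weighted sum of deviance and tree size; alpha is the penalty per leaf."""
    return deviance + alpha * n_leaves


folds = kfold_indices(3065)
print([len(f) for f in folds])
```

Each fold is withheld once while a tree is grown on the other nine, so every observation is used for validation exactly one time.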

The S-Plus 6.2 (statistical software package) 
functions for building, cross-validating and pruning trees 
were used to build a classification tree for this study. A 
classification tree was used to help in determining the 
structure of the data. Some of the independent variables 
had many levels and two independent variables, DLAB and 
AFQT, were continuous. The classification tree helped in 
collapsing the number of levels in some categorical 
variables and in establishing levels for transforming the 
continuous variables into categorical ones. Additionally, 
the classification tree helped in determining whether there 
were differences among the training pipelines, since the 
inferential tests did not find them to be significantly 
different. 

All 3,065 observations were used in the building of 
the classification tree. All independent variables were 
used in the initial process of building the tree. After 
initial analysis, cross-validation and pruning, the final 
classification tree was determined. The following section 
of this chapter contains the results of the tree. 

2. Analysis and Results 

The first classification tree built used the RETH 
variable for the first split. The split was based on the 
level "Unknown" and all others. RETH was not split again 
in the cross-validated and pruned tree. It was determined 
not to use RETH as a variable in building the tree or the 
logistic regression. Splitting only on the level "Unknown" 
contained no useful information about the data. All other 
variables (Table 3.5) were retained for consideration in 
the classification tree. 

The classification tree was allowed to grow to its 
full size; then, cross-validation and pruning were used to 
find the optimal size. Figure 4.1 shows the cross- 
validation plot for the data. It was determined through 
evaluation of this plot that the optimally-sized tree 
contained approximately 15 leaves. The pruned tree for 
this data is presented in Figure 4.2. The tree shows that 
the variables CITIZ, TERM.5, SVC, AFQT, LANG, EDUC, 
TRAIN.PIPE, MOTIV and DLAB were used to grow this tree. 
The misclassification rate for this tree was determined by 
counting the misclassified observations in each terminal 
node and dividing by the total number of observations. The 
misclassified observations are represented by the numbers 
underneath each leaf of the tree and to the left of the 
slash mark. The misclassification rate for this tree was 
31%, which is better than the 45% misclassification rate of 
the root node. 
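
The misclassification calculation is simple enough to sketch. The function name `misclassification_rate` and the (misclassified, total) pair layout are assumptions that mirror the "misclassified/total" labels under each leaf. The only data used here are the root-node counts: classifying all 3,065 observations as failures misclassifies the 1,392 successes.

```python
def misclassification_rate(leaves):
    """Sum misclassified counts over terminal nodes and divide by all observations."""
    misclassified = sum(miss for miss, _ in leaves)
    total = sum(size for _, size in leaves)
    return misclassified / total


# Root node only: every observation is assigned the majority class (failure).
root = [(1392, 3065)]
print(f"{misclassification_rate(root):.0%}")
```

Passing one (misclassified, total) pair per terminal node of the pruned tree would give the rate for the full tree in the same way.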



Figure 4.1 Cross-Validation Plot 


The tree identified citizenship early as an important 
factor for success. The first split separated "Citizen" 
and "Non-Citizen" from "Unknown." This split by itself 
does not provide useful information, but the variable was 
left in the model because "Citizen" and "Non-Citizen" were 
split two levels below. That split involved 1,714 
observations; therefore this variable was deemed to be 
important to the model and left in. 


Figure 4.2 Classification Tree 

The ovals represent non-terminal nodes; the rectangles 
represent terminal nodes. The number in each oval and 
rectangle represents the classification of the observations 
in that node (1 = Success, 0 = Failure). The numbers below 
the nodes are the number misclassified and the total number 
of observations in the node. 

Other factors that were deemed important in the model 
included TERM.5, AFQT, LANG and TRAIN.PIPE. Though other 
variables were deemed important, these are of special 
interest. TERM.5 was identified after the variable CITIZ 
as being significant to the model. For this split, those 
enlistees who had a contract of four years or less had a 
success rate of 71% compared to a 45% success rate for 
those with service contracts greater than four years. This 
confirms what the descriptive statistics in Chapter III 
suggested: that contract length is a central factor in 
success rates for enlistees. After additional evaluation 
of contract lengths, it was determined that a variable with 
three levels (four-year contracts, five-year contracts and 
six-year contracts) would be used for contract length 
(TERM.C). Individuals with contract lengths of two and 
three years were grouped with those who had four-year 
contracts. This was done because there were only 45 
observations with two-year or three-year contracts. The 
variable TERM.C provided the best descriptive statistics 
(Chapter III) and logistic regression model (next section). 
Additionally, the only noteworthy split for AFQT occurred 
at 74.5. This led to using 75 to create levels within the 
AFQT variable, making it categorical. The variable LANG 
did not appear to split in any interpretable manner. When 
proceeding to the logistic regression, the variable was 
collapsed into four levels based on the category of 
difficulty determined by the DLIFLC. Finally, TRAIN.PIPE 
was identified as important only in the last level of 
splits. This is worth mentioning because it suggests that 
TRAIN.PIPE may not be useful in a logistic regression. 
Overall, the model provides a good first look at the 
factors that affect post-DLIFLC success. 
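
The two transformations described above, binning AFQT and collapsing contract length into TERM.C, can be sketched as follows. The function names `afqt_level` and `term_c` are hypothetical; the cut points follow Table 3.5 and the 74.5 split found by the tree.

```python
def afqt_level(score):
    """Bin an AFQT score into the levels of Table 3.5."""
    if score < 75:
        return "A"  # less than 75
    if score <= 90:
        return "B"  # 75-90
    return "C"      # 91-99


def term_c(years):
    """Group two- and three-year contracts with four-year contracts."""
    return 4 if years <= 4 else years


print(afqt_level(74), afqt_level(75), afqt_level(91))
print(term_c(2), term_c(5), term_c(6))
```

One caution applies: a missing AFQT score is recorded as 0 in these data, so this binning places missing scores in level A together with genuinely low scores.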

B. LOGISTIC REGRESSION 

1. Methodology 

Regression is a technique used to estimate the 
relationship between a set of independent variables, called 
predictors, and a dependent variable, called the response 
variable. Logistic regression is the technique applied to 
binary response variables to find the probability that an 
observation falls into one of the two categories of the 
response variable. It uses the method of maximum 
likelihood to produce estimates of the coefficients for 
each independent variable in order to produce a prediction 
(Hosmer and Lemeshow, 1989). By establishing a threshold 
for the predicted probability, each prediction can be 
classified into one of the two response categories. The 
usual threshold is 0.50. By comparing the observed 
response with the predicted value, the model can be 
evaluated for its usefulness. 
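
To make the prediction and thresholding steps concrete, the sketch below converts a linear predictor into a probability and classifies it at 0.50. The function names are assumptions. The coefficients are a subset taken from Table 4.2 (intercept, Male, 6-Year Term and Air Force), used here purely for illustration.

```python
import math


def predicted_probability(intercept, coefficients, indicators):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_j * x_j)))."""
    logit = intercept + sum(coefficients[name] * x for name, x in indicators.items())
    return 1 / (1 + math.exp(-logit))


def classify(probability, threshold=0.50):
    """Assign the response category using the usual 0.50 threshold."""
    return 1 if probability >= threshold else 0


coefficients = {"Male": 0.32, "6-Year Term": -2.27, "Air Force": 1.06}

# All indicators zero: every variable is at its default level.
baseline = predicted_probability(0.84, coefficients, {})
# A male Air Force enlistee on a six-year contract, other variables at default.
example = predicted_probability(
    0.84, coefficients, {"Male": 1, "6-Year Term": 1, "Air Force": 1}
)
print(round(baseline, 3), classify(baseline))
print(round(example, 3), classify(example))
```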

This study uses the binary variable SUCCESS as the 
response variable and independent variables that are all 
categorical with varying numbers of levels. One level of 
each independent variable was chosen as the default level; 
a variable with k levels was then represented by k-1 
indicator variables. The default levels for each 
independent variable are listed in Table 4.1. The S-Plus 
6.2 logistic regression function was used to fit a model 
using all main effects of the independent variables. The 
model was limited to main effects because of the 
computational complexity of adding second-order and higher 
interactions. After building the model with all 
independent variables, analysis of deviance was used to 
determine which variables were significant in predicting 
the response variable (Hosmer and Lemeshow, 1989). Using 
the dropterm() function from the S-Plus MASS library, each 
independent variable was evaluated based on its 
significance to the model. If a variable was determined to 
be not significant, it was dropped from the model and the 
model was fit again without that variable. The difference 
in deviance between the model with and the model without 
the variable was compared to a chi-square distribution. If 
this statistic was not significant, the new model was kept. 
This process was repeated until no more variables could be 
deleted from the model. This process produced a final 
model. 


Table 4.1 Default Independent Variable Levels 

Variable    Default Level
TRAIN.PIPE  Pipeline 1
SEX         Female
CITIZ       Citizen
TERM        4-Year Contract
SVC         Army
AFQT        Less than 75
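
The replacement of a k-level variable by k-1 indicators can be sketched as follows. The function name `dummy_code` and the `is_` prefix are assumptions made for illustration; the service levels and the Army default come from Tables 3.5 and 4.1.

```python
def dummy_code(value, levels, default):
    """Represent a k-level categorical value with k-1 indicator variables."""
    if value not in levels:
        raise ValueError(f"unknown level: {value}")
    return {f"is_{level}": int(value == level) for level in levels if level != default}


services = ["Army", "Air Force", "Navy", "Marine Corps"]
print(dummy_code("Navy", services, default="Army"))
print(dummy_code("Army", services, default="Army"))
```

The default level is the one for which all indicators are zero, so each fitted coefficient is read as a contrast with that level.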

After the final model had been generated, it was 
evaluated for its "goodness-of-fit." Goodness-of-fit was 
analyzed through analysis of deviance and the Hosmer- 
Lemeshow test (Hosmer and Lemeshow, 1989). A "rule of 
thumb" was used for analysis of deviance to get an idea of 
the model's adequacy before proceeding to the Hosmer- 
Lemeshow test. This rule suggests that if the residual 
deviance is approximately equal to its n-p degrees of 
freedom, the model is adequate in predicting the response 
variable. After determining the model was adequate, the 
Hosmer-Lemeshow test was conducted. This test sorts the 
predictions into g groups based on percentiles of estimated 
probability. In each group the number of observed 
successes and the sum of the predicted probabilities are 
computed. A table of observed and expected frequencies is 
developed from these computations. Next, a "C" statistic 
is calculated as the Pearson chi-square statistic from the 
table of observed and estimated expected frequencies. This 
statistic approximately follows a chi-square distribution 
with g-2 degrees of freedom, where g was taken to be 10. 
If the p-value computed from the chi-square distribution is 
not significant, the model is deemed to fit well (Hosmer 
and Lemeshow, 1989). 
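
A compact sketch of the Hosmer-Lemeshow computation is given below. It is an illustration under stated assumptions rather than the routine used in the study: groups are formed by sorting on predicted probability and cutting into g nearly equal parts, and the p-value helper uses the closed-form chi-square tail that holds only for even degrees of freedom (g-2 = 8 when g = 10). The names `hosmer_lemeshow` and `chi2_sf_even_df` are hypothetical.

```python
import math


def hosmer_lemeshow(probabilities, outcomes, g=10):
    """C statistic: sum over groups of (O - E)^2 / (n * pbar * (1 - pbar))."""
    pairs = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for i in range(g):
        group = pairs[i * n // g:(i + 1) * n // g]
        if not group:
            continue
        observed = sum(y for _, y in group)  # observed successes
        expected = sum(p for p, _ in group)  # sum of predicted probabilities
        mean_p = expected / len(group)
        denominator = len(group) * mean_p * (1 - mean_p)
        if denominator > 0:
            statistic += (observed - expected) ** 2 / denominator
    return statistic


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# The 0.05 critical value of chi-square with 8 degrees of freedom is about 15.507.
print(round(chi2_sf_even_df(15.507, 8), 3))
```

Dividing by n * pbar * (1 - pbar) is algebraically the same as summing the Pearson terms for the success and failure cells of each group.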

After evaluating the "goodness-of-fit" and determining 
the final model composition, interpretation of the results 
took place. Part of this interpretation involved 
calculating the odds ratios and confidence intervals for 
each independent variable. 

2. Analysis and Results 

The logistic regression model for the DLIFLC data 
involved analyzing all 3,065 observations and independent 
variables. Using analysis of deviance and the dropterm() 
function in S-Plus, the model was pared down to just 6 
independent variables (with multiple levels). The final 
model's "goodness-of-fit" was evaluated using the Hosmer- 
Lemeshow test to ensure model adequacy. Table 4.2 displays 
the results for this model. This table includes the 
variables, estimated coefficients, standard errors, 
t-values, odds ratios and confidence intervals for each of 
the odds ratios in the final model. Variables of 
significance (confidence intervals that do not contain 1) 
are highlighted. 
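
The odds ratios and confidence intervals follow directly from the coefficients and standard errors. The sketch below assumes a 95% Wald interval, exp(estimate +/- 1.96 x standard error), which is an assumption about how the intervals were formed; it does reproduce the Male row of Table 4.2 (estimate 0.32, standard error 0.09). The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(estimate, std_error, z=1.96):
    """Odds ratio with a Wald confidence interval on the odds scale."""
    odds_ratio = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return odds_ratio, lower, upper


ratio, lower, upper = odds_ratio_ci(0.32, 0.09)
print(f"{ratio:.2f} ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 corresponds to a coefficient whose interval on the log-odds scale excludes 0.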

The variables TRAIN.PIPE (2,3,4,5,6,8) did not appear 
to be different than TRAIN.PIPE (I) and Navy did not appear 
to be different than the Army. Of interest in this model 
were the variables TRAIN.PIPE (7), SEX, SVC (Air Eorce and 
Marine Corps) and AEQT. Evaluating each of these variables 
individually while holding all others constant provides 
valuable insight into how these variables affect success. 
TRAIN.PIPE (7) was the only pipeline that was marginally 
significant in this model. The model suggests that 
individuals that successfully complete the DLIELC through 
TRAIN.PIPE (7) have 0.28-0.99 the odds of success as those 
who complete through TRAIN.PIPE (1) . This is important 
because TRAIN.PIPE (7) is the pipeline that contains 
individuals that required post-DLPT enhancement training. 

Table 4.2 Logistic Regression Model for Post-DLIFLC Success 

Coefficients        Estimate   Std. Error   t-value   Odds Ratio   CI Lower   CI Higher
Intercept               0.84         0.17      5.08
Pipeline 2             -0.16         0.14     -1.11         0.85       0.65        1.12
Pipeline 3             -0.24         0.32     -0.77         0.79       0.42        1.47
Pipeline 4              0.19         0.27      0.70         1.21       0.71        2.05
Pipeline 5              0.27         0.25      1.09         1.31       0.80        2.14
Pipeline 6              0.21         0.52      0.41         1.23       0.45        3.42
Pipeline 7             -0.64         0.32     -1.99         0.53       0.28        0.99
Pipeline 8             -0.14         0.54     -0.25         0.87       0.31        2.51
Male                    0.32         0.09      3.54         1.38       1.15        1.64
Non-Citizen            -0.93         0.30     -3.12         0.39       0.22        0.71
Unknown-Citizen        -5.14         0.51    -10.18         0.01       0.00        0.02
5-Year Term            -0.82         0.12     -7.15         0.44       0.35        0.56
6-Year Term            -2.27         0.16    -14.34         0.10       0.08        0.14
Air Force               1.06         0.17      6.28         2.89       2.07        4.03
Marine Corps            0.08         0.12      0.67         0.86       1.08        1.37
Navy                   -0.01         0.15     -0.06         0.99       0.74        1.33
AFQT (75-90)           -0.41         0.13     -3.13         0.66       0.51        0.86
AFQT (91-99)           -0.40         0.13     -3.03         0.67       0.52        0.86

Within the SEX variable males were shown to have a higher 
probability of success. The confidence interval for males 
suggests that males have 1.15-1.64 times the predicted 
odds of success of females. This corresponds to the 
inferential statistics presented in Table 3.20. The AFQT 
variable was interesting in that in this model it was shown 
that individuals with an AFQT score below 75 had a higher 
probability of success than those at or above 75. For 
individuals with AFQT scores of 75-90, the predicted 
odds of success were 0.51-0.86 times those of individuals 
below 75. For individuals with AFQT scores of 91-99, the 
predicted odds of success were 0.52-0.86 times those of 
individuals below 75. After 
looking more closely at the AFQT variable (Chapter III), no 
meaning can be drawn from the analysis concerning AFQT. 
Over half the observations with AFQT scores below 75 were 
0. A 0 would indicate that the AFQT score for that 
individual was missing. The variable was left in the model 
because the model was determined to have better "goodness- 
of-fit" with the variable than without it. Finally, the 
SVC variable provides valuable information. It was found 
that individuals in the Air Force had 2.07-4.03 times the 
predicted odds of success, and those in the Marine Corps 
had 1.08-1.37 times the predicted odds of success, compared 
to the Army. The model does not suggest any difference for 
the Navy (its confidence interval contains 1). At first 
glance this information seems to be in opposition to the 
statistics developed in Chapter III. Table 3.19 suggests 
that the Navy has a significantly higher percentage of 
success. Tables 3.15 and 3.16 reveal that the Navy has the 
highest percentage of contracts for four years and the Air 
Force has the highest percentage of contracts that are for 
six years. The Navy, Air Force and Marine Corps all have 
high success rates for four-year contracts and all services 
have low success rates for six-year contracts. The Air 
Force has a higher percentage of success for four-year 
(66%) and five-year (88%) contracts compared to the Navy 
(61% and 71%, respectively). The Marine Corps has a higher 
percentage of success for both four-year (80%) and six-year 
(40%) contracts compared to the Navy (61% and 36%, 
respectively). The Marine Corps also has the highest 
percentage of success for six-year contracts. The Navy's 
success rates are lower in two out of three categories of 
contracts compared to the Air Force and Marine Corps. With 
this information, the model makes intuitive sense. Overall, 
the model begins to identify which factors are influential 
in determining post-DLIFLC success. Combined with the 
classification tree model, it provides a good look at these 
relationships. 
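
The relationship between the coefficient estimates, odds ratios and confidence intervals in Table 4.2 can be illustrated with a short sketch. It assumes the intervals are 95% Wald intervals, exp(estimate ± 1.96 × standard error); the study does not state how the intervals were computed, so this is an assumption, although it agrees with the rounded table values for the rows shown. The function name and the choice of rows are illustrative only.

```python
import math


def odds_ratio_with_ci(estimate, std_error, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for one logit coefficient.

    Assumes a Wald interval on the log-odds scale (an assumption, not a
    method confirmed by the study).
    """
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


# (estimate, standard error) pairs copied from Table 4.2
rows = {
    "Pipeline 7": (-0.64, 0.32),
    "Male": (0.32, 0.09),
    "Air Force": (1.06, 0.17),
    "Navy": (-0.01, 0.15),
}

for name, (estimate, std_error) in rows.items():
    ratio, lower, upper = odds_ratio_with_ci(estimate, std_error)
    # A level is flagged when its interval does not contain 1.
    significant = not (lower <= 1.0 <= upper)
    print(f"{name:<12} OR={ratio:.2f}  CI=({lower:.2f}, {upper:.2f})  "
          f"significant={significant}")
```

Under this assumption, Pipeline 7, Male and Air Force are flagged because their intervals exclude 1, while Navy is not.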


C. SUMMARY 

This chapter presented the models developed to try to 
predict success for an individual after graduating from the 
DLIFLC. Classification trees and logistic regression were 
used. 

The classification tree that was used provided a base 
to begin the logistic regression. In particular it 
provided a threshold for AFQT scores in order to develop 
levels within the variable. It also revealed that 
TRAIN.PIPE, the primary variable of concern at the 
beginning of this study, was only slightly influential. 
This was reflected in the inferential statistics developed 
and was again reinforced with the logistic regression. 
Overall, the classification tree provided a good reference 
for those variables that are influential in predicting 
success. 
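
As an illustration of how the AFQT threshold turns a numeric score into the levels used in the logistic regression, the following sketch bins scores into the three levels that appear in Table 4.2. The function names and the sample scores are hypothetical; only the cut points (below 75, 75-90, 91-99) and the use of 0 for a missing score come from the study.

```python
def afqt_level(score):
    """Map a numeric AFQT score to the categorical levels of Table 4.2."""
    if score < 75:
        return "below 75"  # note: also absorbs the 0 placeholder for missing
    if score <= 90:
        return "75-90"
    return "91-99"


def is_missing_afqt(score):
    """A score of 0 indicates that the AFQT score was not recorded."""
    return score == 0


# Hypothetical scores, invented for illustration only.
scores = [0, 0, 62, 75, 90, 91, 99]

below_75 = [s for s in scores if afqt_level(s) == "below 75"]
missing_share = sum(is_missing_afqt(s) for s in below_75) / len(below_75)
print(f"{len(below_75)} scores below 75, {missing_share:.0%} of them missing")
```

The sketch shows why the lowest level is hard to interpret: missing scores coded as 0 fall into the same level as genuinely low scores.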

The last step in the analysis of the data collected 
was in developing and analyzing a logistic regression. 
After successfully paring down the variables using analysis 
of deviance and ensuring "goodness-of-fit" using the 
Hosmer-Lemeshow test, a final model was established. This 
model showed that the variables TRAIN.PIPE (7), SEX, AFQT 
and SVC (Air Force and Marine Corps) were important in 
predicting success after successful completion of DLIFLC. 
Though no conclusions can be drawn concerning AFQT, it was 
left in the model because the model was determined to have 
better "goodness-of-fit" with the variable than without it. 
SVC was particularly interesting in that it revealed 
information that was not known or was not apparent at the 
beginning of this study. Though the Navy had higher 
percentages of success, the Air Force and Marine Corps were 
better predictors of success within SVC. The model 
developed provided valuable and useful information and can 
be used in the future to jumpstart further research. 
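
For readers unfamiliar with the Hosmer-Lemeshow test, the statistic can be sketched as follows. This is a minimal illustration, not the S-Plus procedure used in the study: it assumes equal-sized groups formed by sorting on predicted probability, and it returns only the chi-square statistic and its degrees of freedom (the p-value lookup is omitted). The function name and the toy data are hypothetical.

```python
def hosmer_lemeshow(y_true, p_hat, groups=10):
    """Return the Hosmer-Lemeshow chi-square statistic and degrees of freedom.

    Observations are sorted by predicted probability and split into
    `groups` roughly equal-sized bins; observed and expected successes
    are then compared within each bin.
    """
    pairs = sorted(zip(p_hat, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        if 0 < mean_p < 1:
            stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat, groups - 2


# Toy data: predictions of 0.25 and 0.75 with matching observed rates.
p_hat = [0.25] * 4 + [0.75] * 4
calibrated = [1, 0, 0, 0, 1, 1, 1, 0]
stat, dof = hosmer_lemeshow(calibrated, p_hat, groups=2)
print(stat)  # 0.0, observed and expected counts agree in both groups
```

A small statistic relative to the chi-square distribution with the returned degrees of freedom indicates no evidence of poor fit.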





V. SUMMARY, CONCLUSIONS AND RECOMMENDATIONS 


A. SUMMARY AND CONCLUSIONS 

This study attempted to analyze various personnel, 
military and academic attributes of students who had 
graduated from the DLIFLC, in order to determine the 
effects these factors had on success after DLIFLC. Chapter 
I opened with an introduction of this study and background 
information on the DLIFLC and the students who are enrolled 
at the institute. Chapter II described various other 
studies that identified variables shown to have an effect 
on successful completion of DLIFLC training and on first-term 
enlisted attrition. These variables were used as 
guides and starting points for this study. Chapter III 
gave a description of the data used for this study along 
with the development of numerous descriptive and 
inferential statistics. Chapter IV presented the 
classification tree and logistic regression models 
developed for this study, along with their analysis and 
results. 

Data were gathered on first-term enlistees in the 
Army, Air Force, Marine Corps and Navy who entered the 
DLIFLC between fiscal years 1997-2000 and who graduated. 
The DLIFLC and DMDC provided information pertaining to the 
students' personnel, military and academic backgrounds. 
Out of 56 training pipelines enumerated in the data 
gathered from the DLIFLC, 8 were used. The remaining 
pipelines did not contain enough cases to make any analysis 
meaningful. After sorting through all the data from the 
DLIFLC and DMDC, a total of 3,065 observations were 
considered for use in this study. 

To have been considered a graduate of the DLIFLC, a 
student had to successfully complete his or her course of 
study and meet the requirement of a 2/2/1+ in listening, 
reading and speaking on the DLPT. Post-DLIFLC success was 
defined as an individual completing his or her contractual 
obligation and maintaining his or her language proficiency 
during his or her initial tour of duty. A threshold of up to 
three months prior to the end of contractual obligation was 
used for completion of service because the 
services often allow service members to leave 
prior to the end of their contract. It is acknowledged 
that three months may or may not be the best threshold; 
however, nothing in the data suggested that it was not 
adequate. Maintaining language proficiency was measured by 
continuation of FLPP up to six months prior to end of 
service. This threshold was established due to the 
assumption that the amount of FLPP would not be adequate in 
convincing an individual to put the time and effort into 
obtaining the minimum requirements on the DLPT for the 
remaining six months. FLPP was considered adequate in 
encouraging language proficiency prior to that threshold. 
Again, this threshold may or may not be the best choice, 
but was deemed adequate by all parties involved in this 
study. After defining success at the DLIFLC and post- 
DLIFLC success, the data revealed that only 63% of first- 
term enlistees who entered the basic program at the DLIFLC 
between 1997-2000 graduated. More surprising is the fact 
that, of those individuals that graduated from the DLIFLC, 
only 45% were subsequently successful. Of those who 
graduated from the DLIFLC, 47% attrited before completing 
their term of enlistment. The remaining 8% did not maintain 
their language proficiency. Attrition of 47% is troublesome 
considering the amount of time and money that has been 
invested in each of these individuals by their services. 
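
The success definition above can be expressed as a small decision rule. The sketch below is illustrative only: the function and argument names are hypothetical, the FLPP threshold is measured back from the actual separation date (one reading of "end of service"), and month arithmetic clamps the day to the 28th for simplicity. None of these details is confirmed by the study.

```python
from datetime import date


def months_before(d, months):
    """Return the date `months` calendar months before `d`.

    The day is clamped to 28 so the result is always a valid date
    (a simplification made for this sketch).
    """
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def post_dliflc_outcome(contract_end, separation, last_flpp):
    """Classify a graduate using the thresholds described in the text."""
    # Leaving more than three months before the contract ends counts as attrition.
    if separation < months_before(contract_end, 3):
        return "attrited"
    # FLPP must continue up to six months before the end of service.
    if last_flpp < months_before(separation, 6):
        return "proficiency lapsed"
    return "success"


# Hypothetical dates, invented for illustration only.
print(post_dliflc_outcome(date(2004, 6, 30), date(2004, 5, 15), date(2004, 2, 1)))
```

Under this rule, the 45% who succeeded, the 47% who attrited and the 8% who did not maintain proficiency form three mutually exclusive groups that together account for all graduates.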

Chapter III discussed the descriptive statistics that 
were developed during this study. The statistics describing 
service success rates, contract lengths and success rates 
by contract length and service are all particularly 
important. Looking only at service success rates, the Navy 
(59%) has the highest success rate, followed by the Marine 
Corps (53%), the Army (43%) and the Air Force (41%). This 
would suggest that service affiliation has an effect on 
success. Looking further into the data revealed that the 
Navy had the highest percentage of contracts of four years 
or less, the Army had the highest percentage of five-year 
contracts and the Air Force had the highest percentage of 
six-year contracts. All services had success rates above 
50% for four or fewer years and all services had success 
rates equal to or below 40% for six-year contracts. In two 
out of three contract length categories the Air Force has a 
higher percentage of success than the Army or Navy. The 
Marine Corps has a higher percentage in all three contract 
categories than the Army and in two out of three than the 
Navy. This is important in that it shows overall service 
success rates can be misleading because the mix of 
contract lengths varies among the services. Looking 
at contract length categories by service gives a better 
idea of how the services compare and shows how important 
contract lengths are to success. 
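
The effect of contract mix on an overall success rate can be shown with a short calculation. The Navy success rates by contract length (61%, 71% and 36%) are those quoted in Chapter IV; the two contract mixes are hypothetical and are not taken from Tables 3.15 and 3.16. The sketch only demonstrates that the same per-contract rates can yield very different overall rates.

```python
def overall_rate(rates, mix):
    """Overall success rate as the mix-weighted average of per-contract rates."""
    return sum(rates[length] * mix[length] for length in rates)


# Navy success rates by contract length, as quoted in Chapter IV.
navy_rates = {"4yr": 0.61, "5yr": 0.71, "6yr": 0.36}

# Hypothetical contract mixes (shares sum to 1), invented for illustration.
mostly_short = {"4yr": 0.70, "5yr": 0.20, "6yr": 0.10}
mostly_long = {"4yr": 0.10, "5yr": 0.20, "6yr": 0.70}

print(f"mostly short contracts: {overall_rate(navy_rates, mostly_short):.3f}")
print(f"mostly long contracts:  {overall_rate(navy_rates, mostly_long):.3f}")
```

With identical per-contract rates, the overall rate falls from about 0.605 to about 0.455 when the mix shifts toward six-year contracts, which is why comparisons within contract length categories are more informative.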

After developing descriptive statistics, a model using 
classification trees was developed using S-Plus 6.2 
software. This model was interesting in that it provided a 
threshold for AFQT scores in order to develop levels within 
the variable to transform it from numeric to categorical. 
It also identified CITIZ, TERM.5, SEX, SVC, AFQT, LANG, 
DLAB, EDUC, MOTIV and TRAIN.PIPE as influential in 
predicting success. However, TRAIN.PIPE was identified as 
only slightly influential, as determined by how far down the 
tree it appeared and the fact that it was used in only one 
split. Because no discernible thresholds emerged within the 
LANG and DLAB variables, the levels were established 
consistent with the DLIFLC requirements and 
classifications. The DLAB minimum proficiency cut-offs 
were used to determine levels within the DLAB variable. 
The DLIFLC language categories (I, II, III, IV) 
were used as levels within the LANG variable. The results 
of the classification tree helped in understanding the data 
and in beginning to develop a logistic regression model. 
The results were also important in that both the 
classification tree and logistic regression model found 
common variables as influential. 


Using the threshold established for AFQT, LANG and DLAB and 
the new contract length variable TERM.C, a logistic 
regression model was developed. This model was 
evaluated by way of analysis of deviance and its 
"goodness-of-fit" by way of the Hosmer-Lemeshow test. The final 
model used TRAIN.PIPE, SEX, CITIZ, TERM.C, SVC, and AFQT. 
Except for TERM.C, these variables appear in the 
classification tree as well. TERM.5 (a collapsed version 
of TERM.C) was used in the classification tree. The 
variable LANG did not appear in the final model. Language 
difficulty did not appear to have an effect on post-DLIFLC 
success. 

This model showed that each of the variables retained 
had one or more levels that were significant in predicting 
the odds of success. Individuals that graduated from the 
DLIFLC through training pipeline 7 were less likely to be 
successful than those who graduated through training 
pipeline 1. Males were more likely to be successful; U.S. 
citizens were also more likely to be successful. Overall, 
individuals with five-year or six-year contracts were 
significantly less likely to succeed than those with 
contracts of four or fewer years. Individuals in the Air 
Force and Marine Corps were both more likely to succeed 
than those in the Army. Though AFQT scores were shown to be 
significant, no conclusions can be drawn from the analysis 
because over half the observations below 75 were missing 
an AFQT score. This model, in concert with the 
classification tree, provides a good first look 
at these data and what they can reveal about post-DLIFLC 
success. This model and study should be expanded upon to 
draw more meaning from the results and to ensure 
the results remain consistent for other groups of DLIFLC 
graduates. 
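
To make the direction and size of these effects concrete, the coefficients in Table 4.2 can be converted into predicted probabilities with the inverse logit. The sketch below assumes the reference levels implied by the text (training pipeline 1, female, U.S. citizen, contract of four or fewer years, Army, AFQT below 75); the function name and the example profiles are illustrative, and the coefficients are the rounded values from the table.

```python
import math

# Rounded coefficients copied from Table 4.2 (a subset).
COEF = {
    "Intercept": 0.84,
    "Pipeline 7": -0.64,
    "Male": 0.32,
    "5-Year Term": -0.82,
    "6-Year Term": -2.27,
    "Air Force": 1.06,
}


def predicted_probability(active_terms):
    """Inverse logit of the intercept plus the coefficients of active levels."""
    logit = COEF["Intercept"] + sum(COEF[term] for term in active_terms)
    return 1.0 / (1.0 + math.exp(-logit))


# An empty list corresponds to the assumed reference profile.
for profile in ([], ["Male"], ["Male", "6-Year Term"],
                ["Male", "6-Year Term", "Air Force"]):
    label = ", ".join(profile) or "reference profile"
    print(f"{label:<35} p = {predicted_probability(profile):.2f}")
```

Because the reference AFQT level contains the missing scores discussed above, the absolute probabilities should be read with caution; the comparison between profiles is the more useful result.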

B. RECOMMENDATIONS 

This study was the first to try and capture an 
individual's probability of success after successfully 
completing DLIFLC training. The analysis and results have 
provided a good glimpse into what characteristics begin to 
help in explaining an individual's post-DLIFLC success. 
Now that this study has been completed, follow-on research 
can use it as a starting point to explore more fully the 
characteristics established here and to develop ideas 
for other characteristics that were missed. One 
recommendation is to include more in-depth analysis of the 
characteristics established as influential in this study, 
especially the influence of gender, service affiliation, 
contract lengths and AFQT scores. In particular, 
interactions among the independent variables should be 
considered. Additionally, job assignment after the DLIFLC 
training should be reviewed to determine its importance in 
predicting success. Also, economic 
factors that could influence an individual's success in the 
military (unemployment rate, civilian career opportunities 
for language skills, etc.) should be reviewed. Finally, 
research should be completed on FLPP to try to discover 
whether the levels of compensation have influence on an 
individual's success post-DLIFLC. 


APPENDIX A. NON-GRADUATE TRAINING PIPELINES 


Table A.1 gives a brief description of the basic 
pipelines traversed by these individuals. Out of 6,162 
observations, 2,294 did not graduate for academic, 
administrative and various other reasons. This equates to 
37% of those who started at the DLIFLC. Table A.2 describes 
the codes used in Table A.1. 


Table A.1 Non-Graduate Training Pipelines 

TP    #      OC     Code   OC     Code   OC     Code   OC     Code
F1    1107   NG     A
F2     283   NG            NG     A
F3      30   NG            NG     A
F4     620   G             DLPT   A
F5     230   NG            G             DLPT   A
F6      23   NG            G             DLPT   A
F7       1   NG            NG            G             DLPT   A
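
The totals quoted at the start of this appendix can be checked directly against the pipeline counts in Table A.1. The sketch below is a simple arithmetic check; the variable names are illustrative.

```python
# Number of non-graduates in each pipeline, copied from Table A.1.
non_graduates = {"F1": 1107, "F2": 283, "F3": 30, "F4": 620,
                 "F5": 230, "F6": 23, "F7": 1}
total_observations = 6162

total_non_graduates = sum(non_graduates.values())
non_graduate_share = total_non_graduates / total_observations

print(total_non_graduates)                 # 2294
print(f"{non_graduate_share:.0%}")         # 37%
print(f"{1 - non_graduate_share:.0%}")     # 63%
```

The complement, 63%, matches the graduation rate reported in the body of the study.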


Table A.2 Explanation of Failure Pipeline Coding 

TP    Training Pipeline
#     Number in Training Pipeline